Rule Induction for Classification of Gene Expression Array Data
Abstract
Gene expression array technology has rapidly become a standard tool for biologists. Its use in areas such as diagnostics, toxicology, and genetics calls for good methods for finding patterns and building prediction models from the generated data. Rule induction is one promising candidate method because of several attractive properties, such as a high level of expressiveness and interpretability. In this work we investigate the use of rule induction methods for mining gene expression patterns from various cancer types. Three different rule induction methods are evaluated on two public tumor tissue data sets. The methods are shown to obtain prediction accuracy as good as that of the best current methods, while also allowing for straightforward interpretation of the prediction models. These models typically consist of small sets of simple rules, which associate a few genes and expression levels with specific types of cancer. We also show that information gain is a useful measure for ranked feature selection in this domain.
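To make the last point concrete, the following is a minimal sketch of ranking genes by information gain. It is not the procedure used in the paper: the median split used to discretize each gene, the function names, and the toy gene names and class labels are all assumptions made for illustration.

```python
import math
from collections import Counter
from statistics import median


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Reduction in class entropy after splitting samples at `threshold`."""
    low = [y for x, y in zip(values, labels) if x <= threshold]
    high = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    # Weighted entropy of the two partitions; empty partitions contribute nothing.
    remainder = sum(len(part) / n * entropy(part) for part in (low, high) if part)
    return entropy(labels) - remainder


def rank_genes(expression, labels):
    """Return (gene, gain) pairs sorted by decreasing information gain.

    `expression` maps a gene name to one expression value per sample.
    Assumption: each gene is split at its median expression value.
    """
    scores = {
        gene: information_gain(values, labels, median(values))
        for gene, values in expression.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: four samples, two classes, two genes.
labels = ["tumor", "tumor", "normal", "normal"]
expression = {
    "GENE_A": [0.1, 0.2, 0.9, 1.0],  # separates the classes perfectly
    "GENE_B": [0.1, 0.9, 0.2, 1.0],  # carries no class information
}
ranking = rank_genes(expression, labels)
```

In this toy setting `GENE_A` ranks first because its median split separates the two classes completely, while `GENE_B` leaves the class distribution unchanged. A ranked selection would then keep only the top-scoring genes before inducing rules.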
Similar resources
Feature Selection and Classification of Microarray Gene Expression Data of Ovarian Carcinoma Patients using Weighted Voting Support Vector Machine
DNA microarray gene expression data give access to a wealth of information across thousands of variables (genes). Analysis of this information can reveal the genetic causes of disease and of differences between tumors. In this study we try to reduce the high-dimensional data with a statistical method, select valuable high-impact genes as biomarkers, and then classify ovarian tumors based on gene expression data of...
Classification and Biomarker Genes Selection for Cancer Gene Expression Data Using Random Forest
Background & objective: Microarray and next generation sequencing (NGS) data are important sources for finding helpful molecular patterns. However, the large volume of gene expression data makes it harder to identify the biomarkers associated with cancer. The random forest (RF) is used to effectively analyze the problems of large-p and smal...
Modification of the Fast Global K-means Using a Fuzzy Relation with Application in Microarray Data Analysis
Recognizing genes with distinctive expression levels can help in prevention, diagnosis and treatment of the diseases at the genomic level. In this paper, fast Global k-means (fast GKM) is developed for clustering the gene expression datasets. Fast GKM is a significant improvement of the k-means clustering method. It is an incremental clustering method which starts with one cluster. Iteratively ...
The effect of progesterone treatment after ovarian induction on endometrial VEGF gene expression and its receptors in mice at pre-implatation time
Objective(s): Progesterone is a prerequisite for pre-implantation angiogenesis and induces decidual angiogenesis. The effect of progesterone administration on the endometrium of hyperstimulated mice at pre-implantation time is unknown. Material and Methods: Adult female NMRI mice were divided into three groups [control group, ovarian stimulated group and progesterone treated mice after ovarian stimu...
The Effects of Kainic Acid-Induced Seizure on Gene Expression of Brain Neurotransmitter Receptors in Mice Using RT2 PCR Array
Introduction: Kainic acid (KA) induces neuropathological changes in specific regions of the mouse hippocampus comparable to changes seen in patients with chronic temporal lobe epilepsy (TLE). According to different studies, the expression of a number of genes is altered in the adult rat hippocampus after status epilepticus (SE) induced by KA. This study aimed to quantitatively evaluate changes...